ABSTRACT
INTRODUCTION: Preprints were widely cited during the COVID-19 pandemic, even in the major medical journals. However, because the subsequent publication of a preprint is not always mentioned in preprint repositories, some preprints may be inappropriately cited or quoted. Our objectives were to assess the reliability of preprint citations in articles on COVID-19, to determine the rate of publication of the preprints cited in these articles and, where relevant, to compare the content of the preprints with their published versions. METHODS: Articles on COVID-19 published in 2020 in the BMJ, The Lancet, JAMA and the NEJM were manually screened to identify all articles citing at least one preprint from medRxiv. We searched PubMed, Google and Google Scholar to assess whether, and when, each preprint had been published in a peer-reviewed journal. Published articles were screened to assess whether the title, data or conclusions were identical to those of the preprint version. RESULTS: Among the 205 research articles on COVID-19 published by the four major medical journals in 2020, 60 (29.3%) cited at least one medRxiv preprint. Among the 182 preprints cited, 124 were published in a peer-reviewed journal: 51 (41.1%) before the citing article was published online and 73 (58.9%) later. For nearly half of them, the title, the data or the conclusions differed between the cited preprint and the published version. MedRxiv did not mention the publication for 53 (42.7%) of these published preprints. CONCLUSIONS: More than a quarter of preprint citations were inappropriate because the preprints had in fact already been published by the time the citing article was published, often with different content. Authors and editors should check the accuracy of citations and quotations of preprints before publishing manuscripts that cite them.
Subject(s)
COVID-19, Periodicals as Topic, COVID-19/epidemiology, Humans, Peer Review, PubMed, Reproducibility of Results
ABSTRACT
Clinical Data Warehouses (CDWs) are gold mines of data and may be useful in managing the COVID-19 outbreak. This article details the use of a CDW to retrieve patients for vaccination purposes. A list of 34 diseases (or conditions) was published by the French Health Authorities to target individuals at high risk of developing a severe form of COVID-19. Using a multilevel search engine, 23 queries were built from structured or unstructured data with the help of natural language processing features. The Diagnosis Related Group coding system was used alone in three queries (13.0%) and coupled with unstructured data in four queries (17.4%), while unstructured data were used alone in 16 queries (69.6%). Eleven diseases (or conditions) were too broad to be translated into queries. In total, 6,006 unique re-identified patients were retrieved. This use case demonstrates the usefulness of the Rouen University Hospital CDW in retrieving patients for purposes other than translational research.